{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b6319a2c-c0ea-4c26-8b07-0954ee46aa62",
   "metadata": {},
   "source": [
    "# Immune Signatures of the First Living Human Recipient of a Gene-Edited Pig Kidney (Single-Cell Preprocessing Code)\n",
    "\n",
    "*Authors: Guilherme T. Ribas, André F. Cunha, Jonathan P. Avila, Alessia Giarraputo, Leela Morena, Karina Lima, Rodrigo B. Gassen, Jia-Yun Chen, Jia-Ren Lin, Sandro Santagata, Claire T. Avillach, Birgitta A. Ryback, Martin S. Lindner, Sivan Bercovici, Ivy A. Rosales, Tatsuo Kawai, Helder I. Nakaya, R. B. Colvin, Thiago J. Borges, Leonardo V. Riella*"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b18fbb7d-8cfb-453f-ba7b-988936b8ec0e",
   "metadata": {},
   "source": [
    "## Create folders\n",
    "This notebook assumes that the raw data is in **./data/raw**. The folders below are created in the project root, and all later steps are run from the **./scripts** directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d1022029-4e9d-460e-b121-917b4c88fca8",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# -p keeps the cell from failing when a folder already exists\n",
    "mkdir -p scripts   # All scripts go here\n",
    "mkdir -p ref_files # All reference files and indexes go here\n",
    "mkdir -p outputs   # All outputs, including the final quantification, go here"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18b1cf77-bce8-47b9-b3f7-1d9838f7db9b",
   "metadata": {},
   "source": [
    "## Download reference files\n",
    "Download the reference files into the **ref_files** folder."
   ]
  },
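  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92f5112b-0370-46ea-acd3-6a493339c5a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Download the GENCODE v46 annotation and genome into ../ref_files (run from ./scripts),\n",
    "# then decompress them, because the indexing step refers to the uncompressed .gtf and .fa files.\n",
    "wget -P ../ref_files https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_46/gencode.v46.basic.annotation.gtf.gz\n",
    "wget -P ../ref_files https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_46/GRCh38.p14.genome.fa.gz\n",
    "gunzip ../ref_files/gencode.v46.basic.annotation.gtf.gz ../ref_files/GRCh38.p14.genome.fa.gz\n",
    "\n",
    "## This is the barcode whitelist from 10x Genomics. It was taken from the Cell Ranger 8.0.0 installation:\n",
    "# cellranger-8.0.0/cellranger-cs/8.0.0/lib/python/cellranger/barcodes/\n",
    "## It can also be downloaded from either of these repositories (raw file links):\n",
    "# wget https://github.com/f0t1h/3M-february-2018/raw/master/3M-february-2018.txt.gz\n",
    "# or\n",
    "# wget https://github.com/pachterlab/kite/raw/master/docs/3M-february-2018.txt.gz"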
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37e568ea-d639-4893-b7c7-f0164c45fcb7",
   "metadata": {},
   "source": [
    "## Build extended reference index"
   ]
  },
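  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d2274b8-d78e-47f0-98bc-a8f65adf601c",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# run_pyroe_make_splice.sh\n",
    "# Build the splici (spliced + intronic) reference for a read length of 90.\n",
    "# With pyroe's default flank trim length of 5, the output files are named splici_fl85.*\n",
    "pyroe make-splici ../ref_files/GRCh38.p14.genome.fa ../ref_files/gencode.v46.basic.annotation.gtf 90 ../ref_files/pyroe_ext_ref/\n",
    "\n",
    "# run_salmon_index.sh\n",
    "# Index the splici transcriptome with salmon, using 8 threads\n",
    "salmon index -t ../ref_files/pyroe_ext_ref/splici_fl85.fa -i ../ref_files/salmon_index -p 8"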
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1bbfad7-f32f-42c8-a62f-de7c173d8ddd",
   "metadata": {},
   "source": [
    "## Perform mapping and quantification"
   ]
  },
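  {
   "cell_type": "markdown",
   "id": "7c2e9a41-5b3d-4f68-9a17-2d4c8e6f0b35",
   "metadata": {},
   "source": [
    "As a minimal sketch, assuming the same sample names and **../data/raw** layout as the mapping cell that follows, the check below reports any missing or empty FASTQ file before the long-running mapping step is started. It is not part of the original pipeline, and the script name **check_fastq_pairs.sh** is hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e84b1d06-93fa-4c27-b5e0-61a7d9c3f482",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# check_fastq_pairs.sh (hypothetical name, not one of the original scripts)\n",
    "# Assumption: same sample list and ../data/raw layout as the mapping step\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "missing=0\n",
    "for f in $files; do\n",
    "    for r in R1 R2; do\n",
    "        fq=../data/raw/${f}_${r}_001.fastq.gz\n",
    "        # -s is true only if the file exists and is not empty\n",
    "        if [ ! -s \"$fq\" ]; then\n",
    "            echo \"missing or empty: $fq\" >&2\n",
    "            missing=$((missing + 1))\n",
    "        fi\n",
    "    done\n",
    "done\n",
    "echo \"$missing FASTQ file(s) missing or empty\""
   ]
  },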
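  {
   "cell_type": "code",
   "execution_count": null,
   "id": "048e5ee7-1327-43d8-a1a3-93e8a04cb8fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# run_mapping_sa.sh\n",
    "# Map each sample's paired FASTQ files against the splici index.\n",
    "# -l ISR sets the library type, --chromiumV3 selects the 10x Chromium v3 barcode/UMI layout,\n",
    "# and --sketch makes salmon write a RAD file for alevin-fry instead of quantifying itself.\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "for f in $files; do\n",
    "    echo $f;\n",
    "    salmon alevin -i ../ref_files/salmon_index -l ISR -1 ../data/raw/${f}_R1_001.fastq.gz -2 ../data/raw/${f}_R2_001.fastq.gz -p 8 -o ../outputs/salmon_alevin/${f}/ --chromiumV3 --sketch;\n",
    "    echo ' ';\n",
    "done"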
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34303139-307e-4766-bcca-02b200ca33cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# run_af_permit_list.sh\n",
    "# Generate the permit list of barcodes for each sample from the unfiltered 10x whitelist.\n",
    "# --expected-ori fw keeps only reads mapped in the forward orientation.\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "for f in $files; do\n",
    "    echo $f;\n",
    "    alevin-fry generate-permit-list --input ../outputs/salmon_alevin/${f}/ --expected-ori fw --output-dir ../outputs/alevin_fry_gpl/${f} --unfiltered-pl ../ref_files/3M-february-2018.cleaned.txt;\n",
    "    echo ' ';\n",
    "done"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f05d661-c527-4665-97b6-2adb22471f8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# run_af_collate.sh\n",
    "# Collate each sample's RAD file so that all records of a permitted barcode are adjacent\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "for f in $files; do\n",
    "    echo $f;\n",
    "    alevin-fry collate -i ../outputs/alevin_fry_gpl/${f} -r ../outputs/salmon_alevin/${f}/ -t 8;\n",
    "    echo ' ';\n",
    "done"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9afbab2d-fd56-4562-911b-643aad2edf72",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# run_af_quant.sh\n",
    "# Quantify each sample with the cr-like UMI resolution strategy, using the\n",
    "# three-column transcript-to-gene map written by pyroe make-splici\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "for f in $files; do\n",
    "    echo $f;\n",
    "    alevin-fry quant -r cr-like -m ../ref_files/pyroe_ext_ref/splici_fl85_t2g_3col.tsv -i ../outputs/alevin_fry_gpl/${f} -o ../outputs/alevin_fry_quant/${f} -t 8;\n",
    "    echo ' ';\n",
    "done"
   ]
  },
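  {
   "cell_type": "markdown",
   "id": "2f6d0b8c-4a15-4e93-8c7b-a90e3d51f7c4",
   "metadata": {},
   "source": [
    "A quick sanity check of the quantification, offered as a sketch rather than as part of the original pipeline. It assumes that alevin-fry wrote each count matrix to **alevin/quants_mat.mtx** inside the sample's output folder, with one barcode per line in **alevin/quants_mat_rows.txt**. These file names may differ between alevin-fry versions, and the script name **check_af_quant.sh** is hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5097e3a-1c68-4d2f-a7e4-08f3c6d92a1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# check_af_quant.sh (hypothetical name, not one of the original scripts)\n",
    "# Assumption: the matrix is at <output>/alevin/quants_mat.mtx and the barcodes\n",
    "# are listed one per line in <output>/alevin/quants_mat_rows.txt\n",
    "files='RGc19_S11 RGc18_S10 RGc16_S8 RGc20_S12 RGc14_S6 RGc15_S7 RGc17_S9'\n",
    "for f in $files; do\n",
    "    d=../outputs/alevin_fry_quant/${f}/alevin\n",
    "    if [ -s \"$d/quants_mat.mtx\" ]; then\n",
    "        # Count the barcodes (matrix rows) reported for this sample\n",
    "        echo \"$f: $(wc -l < \"$d/quants_mat_rows.txt\") barcodes\"\n",
    "    else\n",
    "        echo \"$f: no count matrix found in $d\" >&2\n",
    "    fi\n",
    "done"
   ]
  },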
  {
   "cell_type": "markdown",
   "id": "94501ccc-623a-4f68-888d-784f9894c4e8",
   "metadata": {},
   "source": [
    "## Downstream analysis\n",
    "\n",
    "The downstream analysis code is available in the file **xeno_code_scRNA_paper.zip**:\n",
    "\n",
    "https://doi.org/10.7910/DVN/HMKPXS"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:base] *",
   "language": "python",
   "name": "conda-base-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
